The third experiment demonstrated that high-probability causes speed sound
identification relative to low-probability causes. This effect was found in
individual listeners.


Zillions have been spent to improve long-range
underwater detection, but when some stumbling
crew member in a submerged submarine (lying in wait
with its engines shut off) drops a wrench, only the
human listener can identify the unexpected sound
and draw the uncomfortable conclusion (Schroeder, 1977,
p. 184)


Efforts  to  understand  the  identification  or 
classification  of  signals  have  in  recent  years  focused  on 
feature-analysis  models  (Bisson,  1981;  Getty,  Swets,  &  Swets, 
1981;  Howard  &  Balias,  1983).  These  models  have  as  an 
integral  component  psychophysical  functions  which  map  the 
multiple  physical  features  of  the  signal  into  a 
multidimensional  perceptual  space  (i.e.,  Figure  1).  Decision 
algorithms  are  employed  to  partition  the  perceptual  space 
into  the  categories  of  interest.  A  probabilistic  decision 
algorithm  is  often  used  when  a  category  can  have  members 
which  are  similar  to  the  members  of  other  categories.  This 
could  occur  in  the  case  of  sound  classification  under  the 
following  conditions.  Assume  that  the  categories  are  types 
of events and the sounds are examples of these events.
Assume also that the sound effects of some events are similar
to the effects of other, dissimilar events. To classify a
sound in this case, the conditional probability of a
particular cause given that a certain sound has occurred,
p(c|s), must be determined. Howard and Balias (1983) used
Bayes' rule to estimate this conditional probability from the
conditional probability of the sound given the cause,
p(s|c), relative to the conditional probability of the sound
given all other causes. Getty et al. (1981) estimated this
conditional probability on the basis of the confusability of
the stimulus with other stimuli.
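
As a rough illustration of the Bayesian step described above, the following sketch computes p(c|s) for a small set of candidate causes. The function name, the prior probabilities p(c), and all numeric values are assumptions introduced here for illustration; they are not taken from Howard and Balias (1983), whose exact procedure may differ.

```python
def posterior_over_causes(likelihoods, priors):
    """Return p(c|s) for each cause c using Bayes' rule.

    likelihoods: dict mapping cause -> p(s|c)
    priors:      dict mapping cause -> p(c)  (assumed known here)
    """
    # Joint probability p(s|c) * p(c) for each candidate cause.
    joint = {c: likelihoods[c] * priors[c] for c in likelihoods}
    # p(s) is the sum of the joint over all causes (the normalizing term).
    p_s = sum(joint.values())
    if p_s == 0:
        raise ValueError("sound has zero probability under every cause")
    return {c: j / p_s for c, j in joint.items()}


# Hypothetical values for a loud report heard out of context.
likelihoods = {"gunshot": 0.6, "backfire": 0.5, "firecracker": 0.7}
priors = {"gunshot": 0.1, "backfire": 0.6, "firecracker": 0.3}
posterior = posterior_over_causes(likelihoods, priors)
```

With these made-up numbers the backfire receives the largest posterior probability, even though the firecracker has the largest likelihood, because its assumed prior is higher.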

From  the  listener's  perspective,  the  situation  is  as 
illustrated  in  Figure  2  which  suggests  that  the  sound  is 
ambiguous  because  it  could  have  several  causes.  The 
listener's  task  is  to  use  the  information  presented  by  the 
stimulus  and  decide  upon  the  likely  cause  from  a  set  of 
causes. Both Howard and Balias (1983) and Getty et al.
(1981) derive the conditional relationships in Figure 2 from
probabilistic  transformations  of  the  relationships 
illustrated  in  Figure  1.  This  strategy  requires  a  comparison 
of  the  stimuli  to  one  another,  a  strategy  available  to  the 
listener  only  after  experiencing  the  complete  stimulus  set. 
This  indirect  derivation  of  the  conditional  probability  of  a 
cause  given  a  sound  might  be  unnecessary  if  one  could 
directly  estimate  the  conditional  probability.  A  technique 
to  directly  estimate  this  conditional  probability  is 
presented  in  this  report.  This  technique  produces  a  measure 
of  causal  uncertainty  based  upon  probabilities  analogous  to 
the conditional probabilities of Figure 2.

Direct estimates of p(c|s) were developed in order to
investigate  the  hypothesis  that  the  identification  of 
isolated,  non-speech,  environmental  sounds  depends  in  part 
upon  the  number  of  potential  causes  of  the  sound — causal 
uncertainty — as  illustrated  in  Figure  2.  This  hypothesis  is 
a  logical  extension  of  the  general  finding  that  accurate 
identification  of  isolated  sounds  is  possible  if  the 
properties  of  the  sound — particularly  the  temporal 
properties — are  specific  to  the  mechanical  activity  of  the 
source  (Warren  &  Verbrugge,  1984).  If  in  fact  the  sound 
properties  specify  several  types  of  events,  then 
identification  is  compromised.  This  effect  is  somewhat 
analogous  to  the  effects  of  set  size  on  choice  judgments. 

The  relationships  in  Figure  2  reflect  differences  in  the 
number  of  causes  and  the  probabilities  of  these  causes.  The 
effects  of  set  size  on  judgments  are  well  documented.  For 
example,  choice  reaction  time  is  a  function  of  the  size  of 
the  stimulus  set,  as  expressed  in  the  Hick-Hyman  law  (Hick, 
1952;  Hyman,  1953).  Although  the  effect  is  well  established, 
research  on  this  effect  has  been  limited  to  stimuli  which 
permit  a  manipulation  of  stimulus-set  size  and  to  judgments 
which  restrict  the  number  of  alternatives.  Notably  absent  is 
research  which  employs  meaningful,  naturally  occurring 
stimuli  such  as  environmental  sounds.  The  experiments  in 
this  report  take  up  the  issue  of  whether  the  Hick-Hyman  law 
applies  to  the  identification  of  environmental  sounds. 
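
For readers unfamiliar with the Hick-Hyman law, a minimal sketch of its common linear form, RT = a + b * H, is given below. The intercept and slope values are hypothetical placeholders chosen for illustration, not estimates from this report.

```python
import math


def hick_hyman_rt(probabilities, intercept_ms, slope_ms_per_bit):
    """Predict choice reaction time (ms) as a linear function of uncertainty H.

    probabilities: probabilities of the stimulus alternatives (should sum to 1)
    intercept_ms, slope_ms_per_bit: hypothetical parameters a and b
    """
    # Uncertainty H in bits; zero-probability alternatives contribute nothing.
    h = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return intercept_ms + slope_ms_per_bit * h


# Four equally likely alternatives carry H = 2 bits.
predicted = hick_hyman_rt([0.25, 0.25, 0.25, 0.25], 200.0, 150.0)
```

Under this form, doubling the number of equally likely alternatives adds one bit of uncertainty and therefore a constant increment (the slope) to the predicted reaction time.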

Research  on  the  identification  of  meaningful,  non-speech 
sounds  has  focused  on  the  importance  of  particular  stimulus 
properties  (Chaney  &  Webster,  1966;  Howard,  1977;  Mackie, 
Wylie,  Ridihalgh,  Shultz,  &  Seltzer,  1981;  Talamo,  1982; 
Warren  &  Verbrugge,  1984)  or  on  the  role  of  verbal  encoding 
(Bartlett,  1977).  Yet  identification  of  this  type  of  sound 
requires  a  choice  judgment  in  instances  when  the  sound  might 
have  several  causes,  such  as  a  loud  report  heard  at  night 
(gunshot?),  near  a  highway  (backfire?),  around  the  Fourth  of 
July  (firecracker?).  The  ambiguity  of  environmental  sounds 
is  particularly  pronounced  when  taken  out  of  context  (Balias 
&  Howard,  in  press).  Some  sounds  presented  without  context 
appear to be similar to homonyms in speech and are
uninterpretable  without  the  context.  The  equivocal 
information  in  isolated  sounds  has  received  little  research 
effort  but  is  recognized  by  sound-effects  professionals. 
Sound-effects  records  often  contain  a  disclaimer  that  some  of 
the  sounds  in  the  record  might  be  interpreted  differently 
depending  upon  the  context  (e.g.,  Schachner,  1982).  In 
contrast,  the  equivocation  of  information  in  visual  displays 
taken  out  of  context  is  well  recognized  and  is  the  subject  of 
debate on the proper stimulus for perceptual research (Warren
& Shaw, 1985).

In  examining  the  role  of  causal  uncertainty  in  sound 
identification, the unit of analysis for quantifying causal
uncertainty has been a causal "event". The choice of this
unit  is  based  in  part  on  the  descriptions  of  sounds  given  in 
unconstrained  identification  experiments  (Balias  &  Howard,  in 
press;  Vanderveer,  1979).  These  descriptions  are  typically 
about  the  event  that  caused  the  sound  rather  than  the 
acoustic  characteristics  of  the  sound.  Furthermore,  the  work 
on  auditory  pattern  perception  (Bregman,  1978;  Vicario,  1982) 
has  demonstrated  that  the  perception  of  sequences  of  sounds 
is  organized  into  "streams"  of  sound  which  are  heard  to 
originate from separate sources. The streams have a unity and
are  heard  as  a  kind  of  auditory  "object"  projecting  from  a 
single  source  which  has  the  characteristics  of  an  event. 

In  order  to  use  this  unit,  it  must  be  defined.  Event  is 
taken  to  mean  a  generic  spatial-temporal  process  which 
produces  acoustic  effects.  This  usage  is  consistent  with 
recent  ecological  approaches  to  perception.  For  example, 
Warren  and  Shaw  (1985)  define  an  event  as  "a  minimal  change 
in  an  energy  potential  (or  between  energy  potentials)  within 
some  intrinsically  determined  region  of  space-time"  (p.  19). 

The  generic  criterion  is  introduced  to  distinguish  the 
concept  of  an  event  from  particular  examples  of  events.  The 
notion  of  process  is  commonly  assumed  in  acoustics  but  it  is 
important  to  realize  that  the  dynamic  acoustic  pattern  acts 
as  a  reference  to  the  spatial-temporal  event  itself,  and  it 
is  the  event  itself  that  is  thought  to  be  the  cause. 

The  present  research  puts  emphasis  on  the  role  of 
potential  source  events  in  the  identification  of  sounds  and 
in this respect is closely aligned with information theory.
In this theory, the information metric H has been used to
quantify  the  amount  of  information  in  a  signal.  Despite  its 
wide  range  of  applications  and  the  amount  of  research  devoted 
to  it  during  the  1960-70s,  information  theory  now  receives 
very  little  notice  in  contemporary  research.  Luce  (1985)  has 
referred  to  information  theory  as  a  "fad"  that  has  had  little 
lasting  impact  on  psychology.  He  argues  that  the  measure 
-log  p  is  concerned  only  with  quantifying  the  amount  of 
information,  and  is  not  at  all  concerned  with  the  meaning 
conveyed  by  the  information.  And,  as  Luce  notes,  since  the 
latter  is  of  primary  interest  to  the  psychologist  who  is 
studying  information  processing,  this  particular  metric  is  of 
little  import  to  psychology. 

Posner  (1978)  also  comments  on  the  demise  of  information 
theory  in  psychology.  With  the  advent  of  information  theory, 
it  was  thought  possible  to  demonstrate  a  fixed  information 
processing  capacity  in  persons  through  the  quantification  of 
information  transmission.  Posner  claims  that  when  this 
project  failed,  information  theory  was  discarded  by 
theoretically  oriented  psychologists  since  it  could  no  longer 
provide  an  objective,  unitary  basis  for  psychological  theory. 
However,  unlike  Luce,  Posner  does  not  view  information  theory 
as  being  of  only  historical  interest.  The  uncertainty 
measure makes it possible to represent the number of events
and the probabilities of these events in a single metric.
This makes it a useful metric in appropriate applications.
These  applications  are  suggested  by  Garner  (1974)  in  a 
statement  which  summarizes  the  contribution  of  information 
theory: ". . . information theory has provided psychology with
the basic concept of information itself, and it has clarified
that information is a function not of what the stimulus is,
but rather of what it might have been, of its alternatives"
(p. 194). The existence of these alternatives is an
important  factor  in  isolated  sound  identification  and  can  be 
quantified  with  the  information  metric. 


Present  Usage  of  H 


The  critiques  of  information  theory  do  not  question  its 
ability  to  provide  a  rigorous  and  quantitative  assessment  of 
information; they only (rightly) point out its inability to
assess  meaning  and  hence  expose  its  limited  utility  in  the 
realm  of  cognitive  psychology.  The  present  research  differs 
from  past  research  in  its  use  of  the  information  metric  in 
the  following  ways.  The  focus  of  the  present  research  is  not 
on  the  cognitive  processes  that  mediate  the  transmission  of 
information.  In  fact,  information  transmission  as  a  measure 
is  not  directly  relevant  to  the  present  research.  What  is  of 
direct  relevance  is  the  amount  of  information  contained  in  a 
given  type  of  stimulus  and  whether  this  quantity  is  related 
to  the  recognizability  of  the  stimulus.  The  number  of 
possible  causes  for  a  sound  signal,  as  quantified  by  the 
information  metric,  is  itself  being  treated  as  a  dimension  or 
property  of  the  stimulus  in  question.  Thus,  in  analyzing 
sounds,  no  special  assumptions  need  be  made  regarding  the 
processing  and  the  transmission  of  information,  other  than 
the  assumptions  that  they  do  take  place  and  that  a  listener's 
responses  reflect  them.  On  this  account,  the  standard 
criticisms  of  information  theory  do  not  apply  to  the  present 
research. 

This  use  of  the  information  measure  does,  however, 
present  some  difficulties  in  its  calculation.  In  prior 
studies, the experimenter could specify a priori the number
of stimulus (and response) alternatives as well as the
probability values of the stimuli. For example, a
participant would be seated in front of a panel on which ten
light bulbs were attached. The participant would be
requested to respond according to which bulb was lit. Thus
the number of stimuli was built into the design of the
experiment and their probabilities (e.g., frequencies) were
under the control of the experimenter. Uncertainty could be
manipulated by varying either the number of possible stimuli
or their relative frequencies. Calculation of the
information statistic becomes problematic when there is only
one exposure to a given stimulus because it is impossible to
approximate probabilities on the basis of stimulus
frequencies. This difficulty was circumvented by Bartz
(1971)  who  demonstrated  each  of  the  alternative  stimuli  to 
the  participant  before  taking  response  times.  Bartz  was  able 
to  do  so  because  he  could  control  the  number  and  type  of 
stimuli.  In  the  present  research  it  is  impossible  (at  least 
at  the  present  time)  to  specify  on  theoretical  grounds  the 
number  of  operative  stimulus  alternatives. 

An  alternative  way  of  computing  the  information 
statistic  (Balias  &  Howard,  in  press;  Balias,  Sliwinski  & 
Harding,  1986)  relies  upon  the  actual  identification 
responses  given  by  the  listeners.  Participants  are  presented 
with  sounds  and  required  to  identify  each.  The  listeners' 
identification  responses  are  sorted  to  determine  how  many 
different  responses  were  given.  The  number  of  different 
responses  is  used  to  determine  the  number  of  alternatives  and 
their  relative  frequencies  are  used  to  determine  the 
probability  values.  Take  for  example  a  situation  where  ten 
listeners  were  presented  with  a  "click-click"  sound,  and  five 
responded  "stapler,"  three  responded  "light  switch,"  and  two 
persons responded "ballpoint pen." In this circumstance,
the  number  of  alternatives  would  be  3,  with  probability 
values  of  0.5,  0.3  and  0.2,  respectively.  The  H  value  in 
this  example  would  be  1.48. 
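
The calculation in this example can be sketched as follows. The function name is an assumption introduced here; the computation is the standard information metric H = -sum(p * log2(p)) applied to the relative frequencies of the sorted responses. For the ten responses above it evaluates to roughly 1.485 bits, in line with the value given in the text.

```python
import math
from collections import Counter


def response_uncertainty(responses):
    """Compute H (in bits) from a list of categorized identification responses."""
    counts = Counter(responses)   # frequency of each response category
    n = sum(counts.values())      # total number of responses
    # Relative frequencies serve as the probability estimates.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# The ten responses to the "click-click" sound described in the text.
responses = ["stapler"] * 5 + ["light switch"] * 3 + ["ballpoint pen"] * 2
h = response_uncertainty(responses)
```

A sound that every listener identifies in the same way yields H = 0, and H grows both with the number of distinct responses and with how evenly the listeners are spread across them.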

There  are  three  aspects  of  this  method  of  computing 
uncertainty  values  that  enjoy  no  precedent  and  thus  require 
further  comment.  First,  viewed  in  the  context  of  traditional 
information  theory,  this  method  makes  the  tacit  assumption 
that  response  uncertainty  can  be  used  to  approximate  stimulus 
uncertainty.  The  validity  of  this  assumption,  at  least  as  a 
working  hypothesis,  is  crucial  to  the  present  research  since 
it  accepts  information  as  a  property  of  the  stimulus.  For 
this  assumption  to  be  valid  in  the  context  of  information 
theory,  one  condition  must  be  true,  namely,  that  there  is  no 
significant  difference  between  stimulus  uncertainty  and 
response  uncertainty.  That  is,  information  transmission  must 
be  high,  for  only  if  little  information  is  lost  between  the 
source  of  a  signal  and  its  destination,  can  this  condition  be 
realized.  This  seems  to  be  a  plausible  assumption  since  Hoge 
&  Lanzetta  (1968)  demonstrated  that  actual  response 
uncertainty  tracked  objective  uncertainty  that  is  calculable 
a  priori  (prior  to  any  actual  responses).  This  assumption 
can be supported by certain aspects of experimental design.
In particular, response alternatives should not be restricted
to fewer than the number of stimulus alternatives. In the
present experiments this was assured by avoiding sounds which
would be unfamiliar to the listeners.

The  second  aspect  of  this  method  that  merits  further 
discussion  is  the  fact  that  the  alternative  causes  are  not 
definable a priori and could not all be presented during the
experiment. For example, a presented "bang" sound might have
three possible sources: a firecracker, a car backfire, or a
gun. If the present research were performed analogously to
traditional research, all the stimulus possibilities would be
presented  to  each  participant.  In  this  manner,  each 
possibility  could  be  specified  a  priori.  The  proposed  method 
can  specify  stimulus  alternatives  only  in  an  ad  hoc  fashion, 
by  examining  the  actual  identification  responses  given  by  the 
participant.  Indeed  the  only  feasible  method  to  determine 
the  operative  stimulus  alternatives  is  to  infer  them  on  the 
basis  of  the  responses  actually  given. 

A  third  issue  has  to  do  with  the  possibility  that  the 
responses  are  not  indicative  of  actual  stimulus  properties. 
Because  the  verbal  reports  of  the  listeners  are  used  both  to 
calculate  relative  probabilities  and  to  specify  the  stimulus 
categories,  there  is  a  risk  that  if  the  participants  are  not 
accurately  reporting  relevant  and  reasonable  alternatives, 
the  acquired  data  are  meaningless.  There  is  reason  to 
believe  that  this  is  not  the  case.  When  this  method  was  used 
to  calculate  the  information  measure,  a  high  correlation  was 
obtained  between  information  and  choice  reaction  time.  If 
Hick’s  law  is  assumed  to  cover  choice  reaction  time  in  the 
identification  of  non-speech  sounds,  then  methods  of 
calculating  H  would  be  evaluated  according  to  their  fit  with 
the  linear  relationship  described  by  this  law.  Using  this 
method  of  computing  H,  Balias,  Sliwinski,  and  Harding  (1986) 
demonstrated  a  significant  correlation  (r  =  .66,  p  <  .001) 
between  H  and  mean  choice  reaction  times,  suggesting  that  an 
adequate  measure  of  information  had  been  derived.  Experiment 
1 was a replication of Balias et al. with a refined procedure
and  a  wider  variety  of  sounds. 
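
The kind of check described here, correlating per-sound H values with mean identification times, can be sketched as below. The data values are hypothetical and serve only to show the computation; they are not the data of Balias, Sliwinski, and Harding (1986).

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-sound uncertainty (bits) and mean response times (ms).
h_values = [0.0, 0.8, 1.5, 2.3, 3.1]
mean_rts = [1300, 1900, 2100, 3200, 3000]
r = pearson_r(h_values, mean_rts)
```

A positive r of appreciable size would be consistent with the linear relationship that the Hick-Hyman law describes, although a correlation alone does not establish that the law holds.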

To  test  the  validity  of  the  identification  responses 
further,  these  responses  could  be  used  to  select  stimuli  in  a 
follow-up  experiment  designed  to  test  stimulus  confusability 
in  an  identification  task.  If  the  alternative  causes  of  a 
sound  suggested  by  identification  responses  are  poorly 
discriminated  when  presented  for  forced-choice 
identification,  then  the  validity  of  the  initial  responses 
will  be  confirmed.  This  result  would  demonstrate  that  the 
alternatives  provided  by  listeners  reflect  the  possible 
stimulus  alternatives.  Experiment  2  tested  the  validity  of 
participants’  responses  in  this  manner. 

A  final  consideration  is  that  the  definition  of  the 
information  measure  requires  calculations  to  be  performed  on 
the  basis  of  probabilities.  However,  in  most  cases,  these 
probabilities  are  estimated  from  frequencies  and  proportions. 
Despite  the  adequacy  of  approximating  probabilities  from 
frequencies, MacRae (1971) noted that "the mean log
proportion is lower than the mean log probability" (p. 270).
Thus empirical measures of information consistently
underestimate the quantity of information in the population.
Underestimates can be corrected by a technique analogous to
the way that sample variance is corrected to obtain a better
estimate of population variance. Carlton (1969) has proposed
a method of correcting for this underestimation using the
following equation:

where p_i = the proportion of events of type i, q_i = 1 - p_i, k =
the  number  of  categories,  and  n  =  the  number  of  observations. 
Carlton's  method  is  of  little  use  because  it  requires 
knowledge  of  the  probabilities  and  if  these  probabilities 
were  known  the  information  measure  could  be  computed  directly 
without  bias.  There  are  several  techniques  that  can  employ 
Carlton's  equation  if  the  general  nature  of  the  distribution 
of  the  probabilities  is  known.  The  technique  most  applicable 
to  the  present  research  is  the  "raw  bias"  method  described  by 
MacRae  (1971).  This  method  takes  the  distribution  of  sampled 
frequencies  as  representative  of  the  population  distribution. 
Thus,  the  empirically  derived  proportions  serve  as  the 
probabilities  in  the  above  equation. 
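
To make the idea of a bias correction concrete, the sketch below applies the simpler first-order Miller-Madow correction, which adds (k - 1) / (2n ln 2) bits to the uncorrected estimate. This is explicitly not Carlton's equation or MacRae's raw bias method; it is a different, well-known correction substituted here purely for illustration, and the function names are assumptions.

```python
import math


def plug_in_entropy(counts):
    """Uncorrected H (bits) computed from observed category counts."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def miller_madow_entropy(counts):
    """H (bits) with the first-order Miller-Madow bias correction added."""
    n = sum(counts)                       # number of observations
    k = sum(1 for c in counts if c > 0)   # number of categories observed
    return plug_in_entropy(counts) + (k - 1) / (2 * n * math.log(2))


# Counts from the stapler / light switch / ballpoint pen example.
counts = [5, 3, 2]
uncorrected = plug_in_entropy(counts)
corrected = miller_madow_entropy(counts)
```

The correction term shrinks as n grows and increases with k, which matches the qualitative point in the text that the underestimate is most serious when few observations are spread over many categories.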


The  results  reported  by  Balias,  Sliwinski,  and  Harding 
(1986)  were  not  based  upon  the  calculation  of  unbiased 
measures  of  information.  Instead  of  employing  a  correction 
factor  in  the  computation  of  the  information  measures, 
listeners  were  asked  to  name  alternative  responses.  This 
served  to  boost  both  the  k  and  n  values  which  would  decrease 
bias.  Unbiased  information  measures  were  recalculated  from 
the  original  data  but  the  correlations  did  not  improve  over 
those  originally  obtained  and,  in  some  instances,  were 
substantially  lower.  Use  of  the  raw  bias  method  of 
correcting  for  bias  is  not  as  useful  as  simply  asking 
listeners  to  provide  stimulus  alternatives. 

In  summary,  the  present  usage  of  H  to  assess  the  causal 
uncertainty  of  a  sound  and  directly  estimate  an  analogue  to 
the conditional probabilities in Figure 2 is supported on
both  logical  and  empirical  grounds.  Nonetheless,  issues 
remain  and  the  experiments  were  designed  to  investigate 
several  of  these  issues. 


Experiment  1 


The  first  experiment  replicated  and  refined  the  study 
reported  by  Balias,  Sliwinski,  and  Harding  (1986)  on  the 
relationship  between  response  uncertainty  and  identification 
time.  In  that  study,  a  linear  relationship  was  found  between 
these  two  variables  supporting  the  view  that  the  Hick-Hyman 
law  might  be  relevant  to  the  identification  of  environmental 
sounds.  However,  two  aspects  of  that  study  limited  its 
implications.  First,  the  stimuli  included  animal  sounds 
which  were  recognized  more  quickly  than  the  rest  of  the 
sounds,  but  not  always  more  accurately.  Second,  the 
listeners  received  little  practice  in  the  experimental 
procedure and consequently, the variance in the data was
large. In the present experiment, the animal sounds were
used for practice and excluded from the test sounds. In
addition, the sampling rate for digitizing and producing the
stimuli was increased to enhance the fidelity of the sound
reproduction. Furthermore, the role of prior experience with
the  sounds  was  assessed  by  having  the  listeners  rate  their 
familiarity  with  the  events  that  produced  the  stimuli.  The 
purpose  of  the  experiment  was  identical  to  the  previous 
study:  to  determine  the  relationship  between  response 
uncertainty  and  identification  time. 


Method 

Participants.  Thirty  undergraduate  students  volunteered 
as  listeners  in  this  study  and  were  paid  for  their 
participation.  The  ages  of  the  participants  ranged  from  13 
to  27  with  most  between  the  ages  of  20  and  25.  There  were  14 
women  and  16  men.  None  of  the  listeners  had  hearing 
disorders. A majority had received musical training, either
instrumental or voice.

Stimuli.  Forty-eight  sounds  (7  practice  sounds  and  41 
test  sounds,  described  in  Table  1)  were  obtained  from  several 
sound-effects records, and were digitized at 20 kHz for 1.5 s
through a low-pass filter set at 10 kHz. A 0.5 s section of
the  sample  was  selected  for  each  stimulus,  and  was  generated 
with  a  digital-to-analog  converter  (DAC)  at  20  kHz,  and 
passed  through  a  low-pass  filter  set  at  10  kHz.  Wave  forms 
for  each  of  the  sounds  were  plotted  and  analyzed  using  the 
ILS  Software  Package.  For  each  participant,  the  practice 
session  consisted  of  animal  sounds;  the  test  session, 
however,  consisted  of  environmental  sounds.  The  test  sounds 
were  presented  in  random  order  to  control  for  order  effects 
that  might  arise  because  there  were  several  impact  sounds  and 
several  explosion  sounds.  The  sounds  were  selected  to 
represent  a  variety  of  environmental  sounds,  to  pose  both 
easy  and  difficult  identification  problems  (and  accordingly, 
a reasonable uncertainty range), and to be completed within a
0.5 s duration if the sound was noncontinuous. The sounds
were presented at a comfortable listening level.

Procedure.  Participants  were  tested  individually  through 
interaction  with  a  microcomputer  which  presented  both 
instructions and stimuli and obtained responses from a
standard keyboard. The experiment consisted of two parts. In
part  one,  the  listeners  were  instructed  to  press  the  space 
bar  to  initiate  a  sound  and  to  press  it  again  as  soon  as  they 
had  a  reasonable  idea  about  the  cause  of  the  sound.  On  each 
trial,  the  time  between  the  onset  of  the  sound  and  the  space 
bar  press  was  recorded.  The  listeners  then  typed  an 
identification  of  the  sound,  being  instructed  to  provide  both 
a  noun  and  a  verb.  After  completing  the  7  practice  sounds, 
the listeners continued with the 41 test sounds.

Table 1

Description and Source of Stimuli

Sound                         Description                             Source (Record/Volume/Side/Band)

 1. Telephone ringing         high-pitched ringing                    SFX/5/1/6
 2. Clock ticking             three clicking sounds                   SE/2/B/10
 3. Car horn                  medium-pitched horn                     SE/13/B/4
 4. Doorbell                  two high-pitched chimes with the        CBS/3/1/16
                              first higher than the second
 5. Automatic rifle           burst of five shots                     SE/13/B/13
 6. Riverboat whistle         medium-pitched whistle                  SE/13/A/15
 7. Water dripping            high-pitched water drip                 Live recording
 8. Bellbuoy                  high-pitched bell                       AU/4/B/18
 9. Foghorn                   low-pitched whistle                     SE/13/A/13
10. Water bubbling            continuous soft bubbling                AU/4/A/11
11. Bugle                     separate notes increasing in pitch      AU/4/B/6
12. Rifleshot indoors         single low-pitched, muffled shot        SE/2/A/21
13. Lawn mower                continuous, modulated, low-pitched      SFX/1/1/16
                              motor
14. Church-bell tolling       two high-pitched bells                  SE/2/A/8
15. Swish                     oar being rowed in water;               SFX/2
                              sound of water flowing smoothly
16. Knocking on door          three quick knocking sounds             CBS/2/2/11
17. Flush                     initial phase of a toilet               CBS/1/2/17
                              flush with rushing water
18. Footsteps                 woman walking quickly                   SE/13/B/3
19. Fireworks                 powerful muffled explosion              SFX/8/2/11
20. Cigarette lighter         plastic lighter being lighted with      Live recording
                              quick, grinding, high-pitched
                              metallic sound followed by hissing
                              sound of flame
21. Touch tone telephone      single high-pitched tone                SFX/5/1/10
22. Door opening              two high-pitched metallic latching      CBS/2/2/10
                              sounds, one followed immediately
                              by the other
23. Bacon sizzling            sounds of bubbling, frying oil in       AU/4/A/8
                              a frying pan
24. Hammering                 three quick tapping sounds              SFX/3/2/13
25. Submarine dive horn       horn of increasing and then             SFX/1/2/21
                              decreasing pitch
26. Person walking in clogs   two footsteps of person walking         SFX/3/1/25
                              in wooden clogs, each step
                              contains two impact sounds
27. Ignition of car           three revolutions of car engine         SE/13/A/9
                              being started
28. Chopping of tree          single impact sound of an axe           SFX/1/1/18
                              cutting into a tree
29. Power saw                 high-pitched metallic whine             SFX/7/2/23
30. Key in lock               two latching sounds, slightly           SFX/1/2/5
                              muffled
31. Cork popping              popping sound followed by soft          SFX/5/1/13
                              impact of cork
32. File cabinet drawer       sound of metallic wheels rolling        SFX/3/2/6
                              on a metallic track followed by
                              the closing of metallic drawer
33. Door closing              two low-pitched impact sounds,          CBS/2/2/9
                              one followed immediately by the other
34. Car backfire              one explosive backfire followed         SE/13/A/9
                              by an echoing clunk
35. Jail door closing         two echoing impact sounds, one          SFX/1/2/3
                              quickly followed by the other
36. Rifle shot outdoors       single high-pitched shot                SE/2/A/19
37. Light switch              pull-cord light switch consisting       Live recording
                              of two high-pitched transients
38. Stapler                   stapler being pressed, consisting       Live recording
                              of two low-pitched transients
39. Telephone being hung up   phone receiver being placed             SFX/5/1/8
                              into its cradle producing two impacts
40. Sawing of tree            a stroke of a handsaw followed by a     SFX/1/1/21
                              return movement
41. Electric lock             sequence of buzz and then clicking      SFX/1/1/24
                              sound of lock opening

References for sources of recordings

SE/2: Valentino, T. J. (Producer). Sound Effects Vol. II [Album].
New York, N.Y.: Thomas J Valentino Inc.

SE/13: Valentino, T. J. (Producer). Sound Effects Vol. XIII
[Album]. New York, N.Y.: Thomas J Valentino Inc.

AU/4: Holzman, J. (Producer). Authentic Sound Effects Vol. IV
[Album]. New York, N.Y.: The Elektra Corporation.

CBS/1, 2, 3: Hoppe, E., and Dulberg, J. (Producers). The New CBS
Audio-File Sound Effects Library, Vol. II [Album] (1982).
New York, N.Y.: CBS Records. (CBS/1 represents the
first record within the volume, CBS/2 represents the
second record, and CBS/3 represents the third record.)

SFX/1, 2, 3, 5, 7, 8: White, V. (Producer). SFX Sound Effects [Albums].
New York, N.Y.: Folkways Records and Service Corp.


In  the  second  part  of  the  experiment,  each  of  the  41 
test  sounds  was  presented  again  to  give  the  listeners  an 
opportunity  to  provide,  if  they  wished,  any  reasonable 
alternatives  to  their  original  responses.  Upon  completion  of 
the  experiment,  the  participants  were  asked  to  complete  a 
questionnaire  which  asked  for  biographical  information  needed 
to  assess  important  characteristics  of  the  sample  (e.g., 
extent  of  formal  music  training).  The  questionnaire  also 
included  a  set  of  rating  scales  that  the  participants 
completed  to  assess  their  familiarity  with  the  events  which 
had produced the sounds. This six-point scale was anchored
by the terms "familiar" and "unfamiliar". Cohen and Cohen
(1975)  claim  that  scales  of  this  type  have  interval 
properties  for  most  purposes  of  analysis.  The  participants 
were  not  told  that  these  events  had  been  the  same  ones 
presented  to  them  in  the  experiment. 


Results  and  Discussion 

As  expected,  the  distribution  of  response  times  was 
skewed  (Figure  3).  Response  times  that  were  greater  than 
three  standard  deviations  from  the  mean  for  the  particular 
sound  were  discarded  in  further  analyses.  With  this  culling 
of  outliers,  response  times  averaged  across  listeners  ranged 
from  1253  ms  for  the  sound  of  a  telephone  ringing  to  6823  ms 
for  the  sound  of  an  electric  buzzer  lock  (see  Table  2). 
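
The culling rule described above can be sketched in Python. This is a minimal illustration and not the original analysis code: the function name `cull_outliers`, the use of the sample standard deviation, and the two-sided reading of "three standard deviations from the mean" are all assumptions.

```python
from statistics import mean, stdev


def cull_outliers(times_ms, n_sd=3.0):
    """Return the response times for one sound with outliers removed.

    A time is treated as an outlier when it lies more than n_sd sample
    standard deviations from the mean for that sound (assumed two-sided).
    """
    if len(times_ms) < 2:
        # A standard deviation cannot be estimated from fewer than two values.
        return list(times_ms)
    m = mean(times_ms)
    sd = stdev(times_ms)
    return [t for t in times_ms if abs(t - m) <= n_sd * sd]


# Hypothetical response times (ms) for one sound: 19 typical, 1 extreme.
example = [1000] * 19 + [10000]
kept = cull_outliers(example)
```

The rule is applied separately to each sound, so a slow response to an easily identified sound can be culled even though the same time would be typical for a difficult sound.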

The  identification  responses  were  sorted  by  two  research 
assistants  on  the  project  and  a  third  person  who  was 
unfamiliar  with  the  research  hypothesis.  This  third  sorter 
was  a  professional  technical  writer.  All  three  individuals 
sorted  the  responses  into  categories  of  similar  events  using 
these  criteria: 

-  Phrases  using  exactly  the  same  noun  and  verb  should  be 
placed  in  the  same  category. 

-  Phrases using nouns and verbs that are synonyms
should  be  placed  in  the  same  category. 

-  Phrases  describing  the  same  physical  scene,  as  would 
be  used  to  describe  a  scene  in  a  movie  script,  should 
be  placed  in  the  same  category. 

-  A  phrase  missing  a  verb  such  as  in  the  response 
"door",  should  be  set  aside  until  the  first  pass  was 
completed.  These  phrases  should  be  placed  into  the 
most  frequent  category  which  uses  the  noun  contained 
in  the  phrase. 

-  Responses which are not specific enough to be
categorized (e.g., "object hitting another object" or
"item falling") should be excluded from the sorting.




Table 2


Results from Experiment 2


     SOUND        MRT    H1     H2     H3     FAM   CORR

 1.  Tele Ring    1253   0.44   0.44   0.44   1.2    .09
 2.  Ticking      1592   1.34   1.07   0.98   1.3   -.14
 3.  Car Horn     1611   0.75   0.75   0.75   1.4   -.08
 4.  Doorbell     1642   0.58   0.58   0.00   1.3   -.12
 5.  Autorifle    1666   2.28   1.85   1.89   3.6    .25
 6.  Riverboat    1751   1.90   1.26   0.98   3.5    .11
 7.  Drip         1831   2.22   0.99   1.14   1.7    .03
 8.  Buoy         1912   3.03   2.81   2.21   3.9   -.06
 9.  Foghorn      2135   2.26   2.24   1.22   2.7    .39*
10.  Bubble       2325   3.72   2.68   2.75   1.6    .49*
11.  Bugle        2356   2.20   2.19   1.41   2.3    .58*
12.  Rifle In     2371   3.21   2.97   2.49   3.9    .33
13.  Mower        2596   3.77   3.65   2.73   2.6    .24
14.  Church       2614   2.88   2.89   1.68   1.9    .03
15.  Swish        2745   3.91   3.37   0.70   1.5    .19
16.  Door Knock   2779   2.16   1.98   1.44   1.3    .02
17.  Flush        2779   2.36   1.84   1.25   1.4    .50*
18.  Footstep     2823   3.48   2.53   2.04   1.2    .06
19.  Firework     2926   3.32   3.23   2.93   2.9    .33
20.  Lighter      3210   3.46   3.54   3.18   3.2    .00
21.  Touch Tone   3305   4.07   2.36   2.84   1.5    .14
22.  Open Door    3335   3.20   2.94   2.49   1.5   -.22
23.  Frying       3422   3.42   3.56   2.92   2.1    .24
24.  Hammer       3624   3.34   3.13   2.97   2.2    .32
25.  Sub Horn     3695   3.60   3.51   3.07   4.6    .18
26.  Clogs        3799   3.11   3.36   2.23   3.6    .28
27.  Ignition     3802   3.84   3.27   2.83   1.9    .26
28.  Chop         4071   4.96   4.51   3.69   3.6    .27
29.  Power Saw    4113   4.95   4.45   3.77   3.2    .02
30.  Lock Key     4240   3.44   3.67   2.96   2.0   -.16
31.  Corkpop      4296   4.10   3.60   3.44   2.5    .43
32.  Cabinet      4305   3.34   3.48   2.87   2.9   -.04
33.  Close Door   4372   3.02   2.90   2.74   1.4   -.06
34.  Backfire     4610   3.99   3.72   3.13   3.4    .29
35.  Jail Door    5197   3.96   4.13   1.50   3.8    .28
36.  Rifle Out    5240   4.46   3.88   3.11   3.2   -.06
37.  Switch       6022   4.53   4.40   3.79   2.1   -.32
38.  Stapler      6055   4.72   4.65   4.17   2.2    .04
39.  Hang Up      6660   4.97   4.78   4.44   2.1   -.23
40.  Tree Saw     6792   4.81   4.72   4.05   3.7    .25
41.  Elec Lock    6823   4.18   4.11   3.32   3.8    .16

MRT = mean reaction time (ms); H1, H2, H3 = uncertainty values for
three sorters; FAM = average familiarity rating from
biographical questionnaire; CORR = product-moment correlation
between FAM and MRT (* significant at the .05 level).


The  initial  and  alternative  responses  were  treated 
equivalently  in  the  sorting.  These  sortings  were  then  used 
to  compute  the  uncertainty  statistic  using  the  equation: 

$$H = -\sum_{i=1}^{n} p_i \log_2 p_i$$

where H is the amount of uncertainty or causal entropy, $p_i$ is
the proportion of events of category i, and n is the number of
categories. As illustrated in Figures 4-6, the uncertainty
values  obtained  from  the  three  sorters  are  linearly  related 
to the log of average response times. Product-moment
correlations were .85, .89, and .82 for the two research
assistants and the naive sorter, respectively. The increase in the
correlation resulting from a log transformation of response
time was significant for two of the sortings, t(38) = 3.83,
2.64, and 1.43, p < .001, .02, and .20, respectively, for a test
of differences between dependent correlations. Response times are not
usually transformed by a log function in studies of the Hick-Hyman
law, and this result is inconsistent with the
relationship reported by Ballas, Sliwinski, and Harding
(1986). A log transformation produced better linearity in
these results in part because the response time distribution
in the present experiment was truncated at the shorter times
by the exclusion of the animal sounds, which Ballas et al.
found produced the quickest responses. Correlations between
uncertainty and mean response time without the log
transformation were greater than the correlation found by
Ballas et al., but even the largest difference (using H2)
was not significant, z = 1.63, p = .10, for a test
of differences between independent correlations. The
relationships between response time and measures of stimulus
intensity (i.e., peak voltage, average power) were not
significant, a finding consistent with Ballas et al. There
were  no  significant  components  of  higher  order  for  the 
uncertainty  variable. 
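
The uncertainty statistic defined above can be sketched in Python as follows. This is an illustrative sketch rather than the original analysis code: the function name `causal_uncertainty` and the example category labels are hypothetical, and a base-2 logarithm (uncertainty in bits, as is usual for the Hick-Hyman law) is assumed.

```python
from collections import Counter
from math import log2


def causal_uncertainty(category_labels):
    """Compute H = -sum(p_i * log2(p_i)) over the sorted response categories.

    category_labels holds one label per identification response, giving the
    category into which a sorter placed that response.
    """
    counts = Counter(category_labels)
    total = sum(counts.values())
    # p_i is the proportion of responses that fall in category i.
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical sorting of eight responses to one sound into three categories.
labels = ["switch"] * 4 + ["stapler"] * 2 + ["pen"] * 2
h = causal_uncertainty(labels)
```

Under this definition H is zero when a sorter places every response in a single category, which is how a value such as the 0.00 reported for the doorbell under H3 in Table 2 would arise, and H grows as responses spread more evenly over more categories.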

The reliabilities of the three sorters were significant,
r(1&2) = .95, r(1&3) = .87, r(2&3) = .87, p < .001. The
reliability coefficients for sorter #3, the naive sorter,
show that experimenter bias is not a factor in these results
and  indicate  that  the  sorting  procedure  itself  is  reliable 
when  the  sorting  criteria  are  followed.  The  magnitude  of 
these  reliability  coefficients  suggests  that  uncertainty 
values  might  be  stable  for  particular  sounds.  If  this  is  the 
case,  these  values  could  be  used  as  a  measure  of  the 
recognizability  of  a  sound  in  much  the  same  way  that  measures 
of  familiarity  have  been  developed  for  words  and  nonsense 
syllables.  To  test  this  possibility,  it  would  be  necessary 
to  conduct  a  study  similar  to  the  present  one  but  using 
different  examples  of  the  41  sounds.  The  two  sets  of 
uncertainty  values  could  be  compared  to  determine  if  the 
values  for  particular  sounds  are  consistent. 
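
The reliability coefficients discussed above are product-moment correlations between the uncertainty values produced by pairs of sorters. A minimal sketch of that computation is given here; the function name `pearson_r` is hypothetical, and the example uses only the first five rows of Table 2, so its result is not expected to match the coefficients reported for all 41 sounds.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# H1 and H2 for sounds 1-5 of Table 2 (illustration only).
h1 = [0.44, 1.34, 0.75, 0.58, 2.28]
h2 = [0.44, 1.07, 0.75, 0.58, 1.85]
r_12 = pearson_r(h1, h2)
```

The same function could be applied to two sets of uncertainty values obtained from different exemplars of the same sounds, which is the comparison proposed above.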
Figure 4. Relation of mean response time for test sounds to
causal uncertainty calculated from sorting #1 (H1)

Figure 6. Relation of mean response time for test sounds to
causal uncertainty calculated from sorting #3 (H3)

This strategy was pursued in this experiment by
examining the uncertainty values of 15 stimuli that were
common to this study and to the study by Ballas et al.
(1986). It should be emphasized that the stimuli were common
at  the  level  of  cause  but  not  at  an  acoustic  level.  In  other 
words,  the  two  studies  used  different  exemplars  of  the  15 
sounds.  In  addition,  the  digitizing  rate  was  different  in 
the  two  studies.  Two  sortings  were  available  for  the  first 
set  of  sounds  and  three  for  the  second  set.  These  five 
sortings  were  performed  by  five  different  individuals  using 
the  criteria  described  previously.  Uncertainty  values  from 
these  sortings  are  shown  in  Table  3.  Reliability 
coefficients  between  the  different  sorters  are  uniformly  high 
as  shown  in  Table  3.  The  coefficients  with  the  naive  sorter 
are  reduced  markedly  by  the  discrepant  uncertainty  value  for 
the  splash  sound.  This  sorter  used  only  two  categories  for 
the  responses  to  this  sound  choosing  to  focus  on  whether  or 
not  the  splash  involved  a  human  action.  The  other  sorters 
discriminated  between  the  types  of  human  actions  and  the 
types  of  environmental  events.  With  this  exception,  these 
results  indicate  that  the  uncertainty  measures  are  consistent 
not  only  for  different  sorters  on  a  specific  set  of  sounds 
but also for different examples of sounds and different
sorters.

Most  of  the  uncertainty  values  are  consistent  across  the 
sorters  and  studies.  This  consistency  would  be  expected  if 
the  uncertainty  values  are  considered  to  be  a  stimulus 
property.  The  basis  for  treating  these  values  as  a  property 
of  the  stimulus  and  not  of  the  observer  is  the  inherent 
functional  relationship  between  the  physics  of  an  event  and 
its  acoustic  "signature".  Some  events  produce  similar 
acoustic  signatures  and  are  accordingly  confused.  The  notion 
of confusable or indiscriminable acoustic signatures suggests
that these uncertainty values can be used as a quantitative
measure of the recognizability of these sounds. But first,
the  role  of  individual  experience  must  be  addressed.  This  is 
particularly  important  in  view  of  the  finding  that  particular 
sounds  are  reliably  assessed  at  the  same  level  of 
identification  uncertainty.  One  explanation  for  this 
finding  is  that  shared,  prior  experience  with  environmental 
sounds  has  informed  us  about  the  causes  of  some  sounds  more 
than  others.  The  values  could  then  reflect  this  shared 
variation  in  knowledge  about  sounds,  rather  than  a  variation 
in  the  breadth  of  possible  causes  for  a  sound. 

The  issue  of  individual  experience  in  the  identification 
of  these  sounds  was  assessed  through  the  post-test 
questionnaire.  The  participants  were  asked  to  rate  their 
familiarity  with  each  of  the  events  represented  by  the 
stimuli.  These  ratings  were  correlated  with  individual 
response  times  to  determine  the  contribution  of  familiarity 
to identification response time. The correlations were
nonsignificant except for sounds #9, #10, #11, and #17 (Table
2). Average ratings across listeners correlated
significantly with both average response time (r = .31,
p < .05) and response uncertainty derived from the three
sortings (r = .39, .48, and .38, respectively; p < .01, .001, and .01).
However, in a multiple correlation analysis with response
time as the dependent variable, familiarity was a
nonsignificant variable after causal uncertainty had been
entered into the regression model (F(2,38) = 0.11, 3.66, and
0.11; p = .74, .06, and .95 for the three sortings). This means
that causal uncertainty was more important than familiarity,
as measured by the questionnaire, in accounting for
identification response time.


Experiment  2 


This experiment was designed to study the discrimination
and categorization of sound "homonyms" (similar sounds caused
by different events) and to assess the validity of responses
given to such sounds. Identification responses in the first
experiment included different causes for each sound. In some
instances, these causes can be completely incongruent. For
example, a squeaky valve sound used in studies by Ballas and
Howard (in press) was thought by some to be an elephant
trumpeting.  A  question  naturally  arises  about  the  validity 
of  these  verbal  responses.  To  address  this  issue,  one  sound 
was  chosen  for  further  study  to  determine  whether  its 
alternative  causes  are  reasonable.  This  sound  is  produced  by 
a  pull-cord  light  switch.  It  typically  includes  two  "clicks" 
produced  as  the  switch  is  engaged  and  released.  In  pilot 
research  this  sound  was  thought  by  some  listeners  to  be 
caused  by  a  paper  stapler.  A  ball  point  pen  was  also  thought 
to  have  caused  this  dual  click.  Because  these  reported 
causes  are  used  to  calculate  the  uncertainty  measure,  it  is 
important  that  they  not  be  spurious  verbal  responses.  More 
to  the  point,  is  it  reasonable  to  assume  that  a  reported 
alternative  cause  of  a  sound  increases  identification 
uncertainty  because  it  could  have  truly  caused  the  sound? 
Furthermore,  does  the  relative  frequency  of  alternatives 
reveal  the  similarity  between  the  acoustics  of  these 
alternatives  and  in  some  sense  the  probability  of  the 
alternatives  as  causes? 

These  questions  were  addressed  in  this  experiment.  Two 
types  of  events  were  chosen  for  study:  light  switching  and 
paper  stapling.  These  two  events  were  chosen  because  they 
are relatively easy to produce, most listeners would be
familiar with the events, and the events produce a pattern of
complex, short-duration transients, the identification of
which is little understood. Multiple exemplars of each event
were produced by changing the instruments and circumstances
of the event. The listeners were presented with each example
and asked to identify it as a switch or stapler. The
paradigm was a single-interval, two-alternative forced-

Method


Participants.  There  were  twenty  participants  who 
ranged  in  age  from  14  to  30  with  most  20  years  old.  Eleven 
were  women  and  nine  were  men.  None  of  the  participants 
reported  any  hearing  disorders.  All  reported  that  they  had 
heard  the  sounds  of  a  stapler,  pull-chain  switch  and  push- 
dimmer  switch.  Their  recent  experience  with  these  sounds  had 
been  infrequent,  with  five,  seven,  and  six  participants 
reporting  that  they  hear  the  sound  of  a  stapler,  pull  switch, 
and  push  switch, respectively,  less  than  once  a  month.  The 
participants  were  paid  five  dollars  for  participating  in  this 
study. 

Stimuli.  Sixty  stimuli  were  used:  thirty  stapler  sounds 
and  thirty  switch  sounds.  One  of  the  stapler  sounds  was  used 
as  a  practice  stimulus.  The  sounds  were  obtained  under  the 
conditions  listed  in  Tables  4  and  5.  The  sounds  were 
digitized at a 20 kHz sampling rate through a low pass filter
set at 10 kHz. The duration of the sounds varied as shown in
Tables 4 and 5. A tape was made by generating the sounds
with a DAC set at a 20 kHz sampling rate through a low pass
filter set at 10 kHz. The order of the stimuli was random.
The tape recorder had a frequency response of 30 Hz to 15 kHz.

Procedure.  Participants  were  seated  at  a  table  where 
they  received  instructions.  They  were  told  that  they  would 
hear  a  series  of  sounds  each  of  which  would  be  either  a 
stapler  or  a  lightswitch,  the  latter  being  either  of  the 
pull-chain  or  push-dimmer  type.  At  this  point,  the 
investigator  placed  the  staplers  and  light  switches  that  had 
been  used  to  produce  the  sounds  on  the  table  in  front  of  the 
participant.  The  participants  were  not  allowed  to  handle  the 
objects,  nor  were  any  sounds  produced  by  these  objects  prior 
to  or  during  the  experiment.  The  participants  were  informed 
that  half  of  the  sounds  would  be  staplers  and  half  would  be 
light  switches.  They  were  then  asked  to  identify  each  sound 
using  a  6-point  scale  which  included  a  confidence  rating: 

1  =  Light  switch,  certain 

2  =  Light  switch,  probable 

3  =  Light  switch,  possible 

4  =  Stapler,  possible 

5  =  Stapler,  probable 

6  =  Stapler,  certain 

Thus  participants  were  required  to  indicate  their  level  of 
confidence  in  their  identification  of  each  sound.  After 
completing  one  practice  sound,  the  listeners  continued  with 
60  test  sounds. 
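
The 6-point response scale above combines an identification with a confidence level. One way to separate the two when scoring responses is sketched below; the function names `decode_rating` and `is_correct` are hypothetical and are not part of the original procedure.

```python
def decode_rating(rating):
    """Split a rating from 1 to 6 into (identified source, confidence)."""
    if rating not in range(1, 7):
        raise ValueError("rating must be an integer from 1 to 6")
    # Ratings 1-3 name the light switch, ratings 4-6 name the stapler.
    source = "light switch" if rating <= 3 else "stapler"
    # Confidence is highest at the ends of the scale and lowest in the middle.
    confidence = {1: "certain", 2: "probable", 3: "possible",
                  4: "possible", 5: "probable", 6: "certain"}[rating]
    return source, confidence


def is_correct(rating, true_source):
    """True when the identified source matches the event that made the sound."""
    return decode_rating(rating)[0] == true_source
```

Because the scale is ordered from "light switch, certain" to "stapler, certain", the raw rating can also be averaged across listeners, as is done for the ratings reported in Tables 4 and 5.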


Table 4


Production Characteristics and Response
Categorization for 30 Stapler Sounds

1. Stapler Type 1 was a medium-sized plastic-cased stapler (15.2 cm by 5.1 cm);
Type 2 was a metal stapler (20.3 cm by 7.6 cm); and Type 3 was a small
metal stapler (10.2 cm by 5.1 cm).


2. Production characteristics are as follows:

A=Press on wood desk; B=Press in hand; C=With paper; D=Press into foam;
E=Press on metal wall; F=No base; G=With base.


Table  5 


Production  Characteristics  and  Response 
Categorization  for  30  Switch  Sounds 


Sound   Switch     Production            Duration   Response Categorization
No.     Type (1)   Characteristics (2)   (ms)       (mean & SD)

1. Switch types 1-5 are chain pull switches; switch type 6 is
a  plastic  push-dimmer  switch. 

2.  Production  characteristics  are  as  follows: 

A=Down  pull;  B=Straight  out  pull;  C=Side  pull;  D=Handheld  press; 
E=Side  press;  F=Press  on  wood;  G=Sideways  press;  H=Angled  press. 


Results  and  Discussion 

The  average  response  ratings  for  each  of  the  stimuli  are 
listed  in  Tables  4  and  5  together  with  standard  deviation 
bars.  There  were  significant  differences  in  the  average 
ratings  and  some  stimuli  were  rated  incorrectly.  Using  a 
criterion  of  two  standard  deviations  from  the  mean  of  the 
ratings  for  the  type  of  sound,  stapler  stimuli  #21,  #23  and 
#29  and  light  switch  stimuli  #3,  #4,  #13,  #21,  #22,  #23,  #24, 
#25,  #27,  #28,  and  #29  were  incorrectly  identified.  These 
include  all  but  two  of  the  push-dimmer  switch  stimuli.  The 
duration  of  the  sound  was  not  systematically  related  to 
identification  ratings.  These  results  indicate  that  certain 
examples  of  a  sound  can  be  thought  to  have  causes  other  than 
the  actual  cause. 

Waveform  and  spectral  analyses  of  the  sounds  using  the 
ILS  software  package  revealed  that  several  features  of  the 
stimuli  might  be  important  in  identification  of  a  sound  as  a 
stapler  or  switch.  All  of  the  stimuli  were  characterized  by 
two  transients  as  illustrated  in  Figure  7.  Stapler  stimuli 
were  probably  identified  on  the  basis  of  low-frequency 
components.  The  spectra  of  correctly  identified  stapler 
stimuli  were  characterized  by  low-frequency  components  (e.g., 
stapler  #5  as  shown  in  Figure  7).  The  push-dimmer  switches 
shared  this  characteristic,  and  thus  were  thought  to  be 
stapler  sounds  more  so  than  the  other  types  of  switches. 
Similarly,  the  stapler  stimuli  which  were  characterized  by 
high-frequency  components  (e.g.,  stapler  #29)  were  less 
accurately  identified. 

A  feature  of  the  pull-chain  switches  that  might  have 
been  used  for  identification  was  the  repeated-impulse  pattern 
preceding  the  transients,  especially  the  first  transient 
(Figure  8).  This  pulse  pattern  was  caused  by  the  rolling 
chain  and  was  evident  in  several  of  the  most  accurately 
identified  switches  (switches  #6  and  #9).  A  second  feature 
that  might  have  enabled  listeners  to  identify  the  switches  is 
the  harmonic  pattern  that  was  evident  in  the  first  transient 
of  several  identified  switches  (Figure  9).  This  harmonic 
structure  would  have  been  caused  by  the  reverberation  of  the 
ceramic  light  socket  or  the  hanger  from  which  the  switch 
hung.  This  structure  was  not  evident  in  the  stapler  sounds. 

Two  of  the  sounds  in  this  study — switch  #6  and  stapler 
#17 — were  used  in  the  first  experiment.  Data  from  the  two 
experiments can be compared to determine if the free
identification of these two sounds in the first experiment
reflects the ability of listeners to categorize these two
sounds in the second experiment. In the first experiment,
the light switch was correctly identified in 18.6% of the
responses and incorrectly identified as a stapler in 4.7%.
The ratio of switch to stapler responses was 3.95:1. The
results of the second experiment were consistent in that this
stimulus was rated as the most likely switch. The stapler
sound was correctly identified in 10.4% of the responses and
incorrectly  as  a  switch  in  6.3%.  The  ratio  of  stapler  to 
switch  responses  was  1.65  :  1.  This  was  consistent  with  the 
second  experiment  in  that  this  stapler  was  rated  in  the 
middle  of  the  response  scale.  An  alternative  analysis  of 
these  results  revealed  that  this  consistency  is  approximated 
by  a  linear  function.  In  Figure  10,  the  response  percentages 
from  Experiment  1  are  plotted  against  the  ratings  of 
Experiment  2,  expressed  as  deviations  from  the  appropriate 
endpoint  of  the  response  scale.  The  endpoint  for  these 
deviations  was  the  correct  end  of  the  scale  for  the 
particular  sound.  Four  data  points  are  possible  given  the 
two  sounds  and  two  response  alternatives  for  each  sound.  The 
results  indicate  that  a  linear  function  describes  the 
consistency  of  the  two  experiments.  This  function  expresses 
the  relationship  between  the  proportion  for  two  alternative 
responses  given  in  unconstrained,  unprompted  identification 
and  the  rated  position  of  the  stimulus  along  a  scale  between 
these  two  alternatives.  This  finding  means  that  when  one 
group  of  listeners  was  asked  to  identify  these  two  sounds, 
the  response  proportions  for  two  alternatives  reflected  the 
ability  of  other  listeners  to  categorize  the  sounds  into 
these  two  alternatives.  In  other  words,  the  response 
proportions  found  in  Experiment  1  may  reflect  the  similarity 
in  the  acoustics  of  these  alternatives  and  the  probability  of 
these  alternatives  as  causes.  This  finding  further  supports 
the  use  of  the  uncertainty  value  as  a  measure  of  the 
recognizability  of  these  sounds. 


Experiment  3 


Although  there  is  evidence  that  listeners  engage  in  a 
cognitive  evaluation  of  alternatives  in  identifying  the 
environmental  sounds  that  have  been  used,  there  is  little 
direct  evidence  about  the  nature  of  this  evaluation  process. 
The  evidence  to  date  is  based  upon  averages  of  reaction  times 
across  listeners  and  upon  the  sorted  responses  of  a  group  of 
listeners.  Aggregated  results  such  as  these  are  weak 
evidence  that  individuals  engage  in  an  evaluation  of 
alternatives  and  that  the  number  and  conditional  probability 
of  these  alternatives  are  reflected  in  identification 
response  time.  Furthermore,  the  response  times  could  be  due 
to response selection even though Ballas and Howard (in
press)  found  evidence  to  the  contrary. 

In  order  to  determine  if  individual  listeners  engage  in 
a  process  that  is  sensitive  to  the  conditional  probability  of 
alternative  causes,  a  memory  priming  study  was  designed. 
Listeners  were  presented  with  phrases  suggesting  causes  for  a 
sound  which  they  were  about  to  hear.  These  phrases  were 
taken  from  the  results  of  Experiment  1  and  represented  two 
levels  of  causal  probability  for  the  sounds.  The  listeners 
were given adequate time to read the phrase and then
presented with a sound. Their task was to decide quickly and
accurately  if  the  sound  could  have  resulted  from  the  event 
described  with  the  phrase.  If  individual  listeners  engage  in 
a  cognitive  process  that  must  link  sounds  to  causes,  and  if 
the  time  course  of  this  process  is  determined  by  the 
conditional  probability  of  the  cause,  then  response  time  for 
positive  decisions  should  be  quicker  for  phrases  describing 
high  probability  causes.  This  effect  should  be  observed  in 
individual  listeners. 


Method 

Participants. Nineteen students volunteered as listeners
in this experiment and were paid for their participation.
The ages of the participants ranged from 17 to 29. There
were 10 females and 9 males. None reported any hearing
disorder. Eleven had received formal training in music or
voice.

Stimuli. Forty-one environmental sounds were presented,
of which 29 were test stimuli and 12 were stimuli for catch
trials. Sounds were sampled and digitised as described in
Experiment 1. The sounds were the same as the stimuli
presented in Experiment 1. Practice sounds were the eight
animal sounds also used in Experiment 1.

For  each  of  the  41  stimuli  two  verbal  probes  were 
selected.  One  of  the  probes  was  a  high-probability  cause  of 
the  sound  and  the  other  probe  a  low-probability  cause,  as 
determined by the analyses performed in Experiment 1. For
example, if in Experiment 1 a particular identification
response for a sound was given 20 times and another response
was given for the same sound only 3 times, these responses
would be high- and low-probability causes, respectively. The
criteria employed in selecting causes used as probes were as
follows: for a response to be used as a high-probability
probe, it must have been given at least twice as frequently
as the response to be used as a low-probability probe; and,
for a response to be used as a low-probability probe, it must
have  been  given  at  least  twice  for  a  particular  sound. 
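
The two selection criteria above can be expressed as a short sketch. This is an illustration under stated assumptions, not the original selection procedure: the function name `select_probes` and the example response phrases are hypothetical, and the most frequent response is assumed to serve as the high-probability probe.

```python
def select_probes(response_counts):
    """Pick a high-probability probe and the eligible low-probability probes.

    response_counts maps each identification response for one sound to the
    number of times it was given in Experiment 1.
    """
    if not response_counts:
        return None, []
    # Assumption: the most frequent response is the high-probability probe.
    high = max(response_counts, key=response_counts.get)
    high_n = response_counts[high]
    # A low-probability probe must have been given at least twice, and the
    # high-probability probe at least twice as often as it.
    lows = [resp for resp, n in response_counts.items()
            if resp != high and n >= 2 and high_n >= 2 * n]
    return high, lows


# Hypothetical counts echoing the 20-versus-3 example in the text.
counts = {"door closing": 20, "drawer closing": 3, "box dropped": 1}
high_probe, low_candidates = select_probes(counts)
```

With these counts, the response given only once is excluded as a low-probability probe, while the response given 3 times qualifies because the high-probability response was given more than twice as often.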


Procedure.  Listeners  were  seated  in  front  of  a  keyboard 
and  computer  terminal  inside  a  sound  attenuating  booth. 
Instructions  were  displayed  on  the  terminal.  Each 
participant  received  a  practice  session  consisting  of  two 
parts.  In  the  first  part,  listeners  were  acquainted  with  the 
"yes" and "no" keys. Either the word "yes" or "no" was
displayed  on  the  screen  and  participants  were  required  to 
press  the  appropriate  key  as  quickly  and  as  accurately  as 
they  could.  Each  participant  received  thirty  of  these 
trials.  During  the  second  part  of  the  practice  session, 
participants  were  presented  with  verbal  probes  for  the  eight 
practice  sounds,  just  as  they  would  during  the  test  session. 


Participants  were  instructed  to  fixate  on  a  white  dot 
centered  on  the  screen  prior  to  each  trial.  The  participant 
initiated each trial by pressing the space bar, after which
the probe would appear on the screen for 1.5 s. Then a sound
was played over headphones to the participant. The
participant's  task  was  to  decide  as  quickly  and  as  accurately 
as  possible  whether  the  sound  could  have  been  caused  by  the 
event  described  by  the  preceding  phrase.  Each  sound  was 
presented  twice  throughout  the  course  of  the  experimental 
session.  For  half  of  the  participants  the  high-probability 
probes were presented before the low-probability probes, and
for  the  other  half  the  sequence  was  reversed.  Within  these 
two  categories,  sound  order  was  randomised. 


Results  and  Discussion 

The  relevant  data  for  analysis  were  the  response  times 
for positive responses to valid (i.e., non-catch) stimuli.
The average response time on trials with high-probability
probes was 347 ms faster than the response time with
low-probability probes (1261 ms and 1608 ms, respectively).
Seventeen of the 19 participants responded more rapidly to
the high-probability probes. An analysis of variance
verified that the effect of probe type was significant,
F(1,18) = 12.58, p < .005. The responses to 24 of 27 sounds
were made more rapidly if a high-probability probe was
presented. Two sounds had too few responses on trials with
low-probability probes for this analysis. These two sounds
were the doorbell and the foghorn, and the low-probability
probes were "telephone" and "train hoot".

This  result  has  two  important  implications.  First,  the 
determination  of  the  cause  of  a  sound  involves  a  cognitive 
process  that  is  related  to  the  probability  of  the  causes 
being  considered.  Low-probability  causes  take  longer  to 
confirm  than  do  high-probability  causes.  Second,  this  effect 
is  not  due  to  the  framing  of  a  verbal  description  of  the 
cause,  as  could  be  claimed  for  the  results  in  Experiment  1. 
The  participants  in  this  experiment  were  presented  with  a 
description  that  had  been  generated  by  participants  in  the 
first  experiment.  The  participants  in  this  experiment  had 
ample  opportunity  to  read  the  description  before  the  sound 
and  only  had  to  confirm  or  reject  the  suggested  cause. 

The  issue  of  stereotypy  is  raised  by  the  results  of  this 
experiment.  A  psycholinguist  who  reviewed  the  procedure 
suggested  that  slower  response  times  could  be  due  to  a 
mismatch  between  the  expected  sound  suggested  by  the  phrase 
and the actual sound presented. The expected sound would be
the stereotype held by the individual listener. Some
stereotypes would be shared by most listeners. This is
probably the case for a water drop sound. The acoustic
"signature" of this event is limited in its permutations by
the physics of the event and it is unlikely that any
particular example of a water drop would be inconsistent with
the  stereotype  held  by  any  individual.  In  other  instances, 
the  stereotype  held  by  individuals  could  vary.  For  example, 
one’s  stereotype  of  the  sound  of  a  typewriter  could  depend  on 
experience  with  different  typewriters  or  with  computer 
terminals.  An  issue  that  arises  here  is  the  specificity  of 
event  description.  If  the  event  is  specified  to  the  level  of 
kind-of-typewriter  then  the  different  stereotypes  for 
typewriting  actually  represent  different  causes.  If  the 
event  is  characterised  broadly  as  someone  typing,  then  the 
stereotype  could  be  very  different  for  different  individuals. 

Given  that  stereotypes  exist,  it  is  not  known  whether 
the  basis  of  the  stereotype  is  linguistic  or  acoustic.  The 
answer  presumes  an  understanding  of  how  the  acoustics  and 
verbal  description  of  a  sound  are  encoded  in  memory. 

Bartlett  (1977)  found  support  for  a  dual  encoding  of 
environmental  sound  and  suggested  that  these  codes  are 
similar  to  the  distinction  between  visual  and  verbal  codes. 
Bartlett  also  found  that  the  consistent  labeling  of  a  sound 
was  related  to  better  memory  recognition  performance.  This 
improvement  was  related  to  several  subtle  aspects  of 
identification  performance  in  a  signal  detection  experiment. 
Response  bias  became  more  conservative  and  the  variance  of 
the  signal+noise  distribution  was  increased  with  the  use  of 
labels.  Thus,  there  are  complex  interactions  between  the 
verbal  and  acoustic  dimensions  of  a  sound. 

Stereotypy  is  clearly  a  component  of  the  verbal 
encoding.  It  is  an  aspect  of  semantic  memory  that  is  related 
to  response  time.  The  original  work  on  semantic  memory 
networks  by  Collins  and  Quillian  (1969),  which  was  based  upon 
sentence-verification  response  times,  was  later  found  to  be 
confounded  by  the  stereotypy  of  the  verbal  items  employed. 
Recent  semantic  memory  models  have  incorporated  this  factor 
into  the  design  of  the  network  linkages  and  the  manner  in 
which  the  network  is  assessed.  However,  there  is  no  known 
work  on  the  role  of  stereotypy  in  the  encoding  of  the 
acoustic  properties  of  a  sound.  Future  research  must  pursue 
this  issue. 

General  Discussion 


The  most  important  finding  in  this  series  of  experiments 
is  that  the  identifiability  of  a  sound  can  be  reliably  and 
validly  quantified  with  the  uncertainty  measure.  Its 
validity  in  reflecting  potential  causes  was  supported  in  an 
initial  test,  although  further  testing  would  be  warranted. 

If  the  reliability  and  validity  of  the  measure  are 
established,  then  its  use  as  a  measure  of  identifiability  has 
important  methodological  and  theoretical  implications. 
Methodologically,  a  measure  such  as  this  should  guide  the 
selection  of  stimuli  for  research.  In  identification  studies 
that  employ  real  sounds,  the  variation  in  the  identifiability 
of  these  sounds  must  be  considered.  For  example,  studies 
assessing  the  effectiveness  of  a  sound-classification  aid 
should  be  designed  to  avoid  the  confounding  effect  of 
variation  in  sound  identifiability.  An  uncontrolled 
variation  could  produce  spurious  results  if  more  recognisable 
sounds  were  used  with  one  decision  aid  and  not  with  another. 
Theoretically,  the  reliability  of  this  measure  and  its 
relationship  to  identification  time  and  identification 
confidence  (Balias  &  Howard,  in  press)  have  implications  for 
the  design  of  environmental  sound  identification  models.  The 
findings  of  these  experiments  can  be  framed  as  requirements 
for  such  models.  As  such,  these  findings  would  demand  that 
sound  identification  involves  a  cognitive  consideration  of 
alternative  causes,  most  likely  in  a  manner  sensitive  to  the 
likelihood  of  these  alternatives  and  sensitive  to  the 
similarity  in  acoustic  signatures  of  alternative  causes. 

The  models  of  Howard  and  Balias  (1983)  and  Getty  et  al. 
(1981)  are  consistent  with  these  requirements,  and,  as  noted 
earlier,  the  uncertainty  measure  is  but  a  direct  estimate  of 
the  conditional  probabilities  these  models  produce.  Direct 
estimation  is  possible  with  the  type  of  sounds  used  in  these 
experiments  because  the  listeners  have  a  long  history  of 
experience  with  these  sounds.  Thus  they  have  had  the 
opportunity  to  develop  a  knowledge  of  the  alternative  causes, 
their  perceptual  effects,  and  the  confusability  of  these 
effects.  It  is  only  because  environmental  sound  constitutes 
a  familiar  domain  of  sound  that  estimates  of  these 
conditional  probabilities  can  be  made  directly. 
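
As a rough illustration of direct estimation, the sketch below assumes that the uncertainty measure is the Shannon entropy of the causes listeners name for a sound, with each cause's share of the responses serving as an estimate of p(c|s). The response lists are hypothetical and are not data from these experiments; the function names are likewise illustrative.

```python
import math
from collections import Counter


def cause_probabilities(responses):
    """Estimate p(cause | sound) as the proportion of listeners naming each cause."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {cause: n / total for cause, n in counts.items()}


def uncertainty_bits(responses):
    """Shannon entropy (in bits) of the named causes; an assumed form of the measure."""
    probs = cause_probabilities(responses)
    return sum(-p * math.log2(p) for p in probs.values())


# Hypothetical responses from eight listeners per sound (not experimental data).
water_drop = ["water drop"] * 8
doorbell = ["doorbell"] * 4 + ["telephone"] * 2 + ["chime"] * 2

print(cause_probabilities(doorbell))
print(uncertainty_bits(water_drop), uncertainty_bits(doorbell))
```

Under this assumption, a sound that every listener attributes to the same cause has zero uncertainty, and the value grows as responses spread more evenly over more alternative causes.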

The  response  time  results  would  present  no  difficulty 
for  the  probabilistic  classification  models  because  the 
comparison  of  alternatives  in  these  models  is  dependent  upon 
the  number  of  alternatives  involved.  However,  the  models  are 
not  specific  about  how  the  number  of  alternatives  is  encoded 
or  retrieved,  which  alternatives  would  be  evaluated,  and  how 
they  would  be  chosen.  Memory  network  models  handle  these 
aspects  of  identification  better.  The  results  of  these 
experiments  would  be  consistent  with  a  memory  retrieval  model 
that  involves  two  stages:  a  search  for  causes  and  an 
evaluation  of  these  causes.  The  first  stage  might  involve  a 
search  through  a  cognitive  network  of  causes  along  paths  that 
represent  associative  relations  at  an  acoustic  or  semantic 
level.  This  type  of  network  would  be  similar  to  the  semantic 
networks  of  memory  such  as  those  suggested  by  Anderson  and 
Bower  (1980)  and  others.  The  structure  of  such  a  network  for 
environmental  sounds  is  a  topic  that  must  be  addressed  in 
future  research. 